DOCUMENT RESUME

ED 086 228                                              IR 000 080

AUTHOR          Abbott, George L.
TITLE           Indexing for the Growing Instructional Media Center.
INSTITUTION     Stanford Univ., Calif. ERIC Clearinghouse on
                Educational Media and Technology.
SPONS AGENCY    National Inst. of Education (DHEW), Washington, D.C.
PUB DATE        Sep 73
CONTRACT        NE-C-00-4-0027
NOTE            25p.

EDRS PRICE      MF-$0.65 HC-$3.29
DESCRIPTORS     Classification; Filing; *Indexing; *Instructional
                Materials Centers; Instructional Media; Library
                Science; State of the Art Reviews
IDENTIFIERS     Library of Congress; Sears List of Subject Headings

ABSTRACT
Since indexing systems concentrate upon the information content of
materials and not upon their form, instructional media centers (IMC)
can use one system for all media. Content descriptors can be selected
from a thesaurus of accepted terms, from the title of the material,
or from an analysis of the content. The first of these three methods
is the most satisfactory for dealing with multiple forms of media;
the Sears List of Subject Headings and Subject Headings Used in the
Dictionary Catalogs of the Library of Congress are the most commonly
used thesauri. It is recommended that the main file index of the IMC
contain all entries for all materials and that in-depth indexing be
provided through use of several descriptors for each item. Lastly,
catalog card files should be employed wherever possible. (PB)






INDEXING FOR THE GROWING INSTRUCTIONAL MEDIA CENTER 



George L. Abbott 
September 1973 






TABLE OF CONTENTS



Introduction

Media as Information

Definition of Indexing

Overview of Indexing Systems

Applicability of Existing Indexing Systems to Media

Summary

Bibliography

Selected List of Thesauri and Media Indexes



INTRODUCTION



Instructional Media Centers are being confronted with growing
collections of media in many new formats. It sometimes appears as
though a new media format, or some variation of an old format, is
introduced every week. These new formats raise many questions about
their handling in the Instructional Media Center. The specific
question we will treat here is how to index these media for ease of
retrieval by users.



MEDIA AS INFORMATION 



All media exist for the communication of information or ideas.
A medium, in and of itself, is not the content of this information
but merely a carrier - a road on which to transport the information.

Communication models have been set up by many educators to ex- 
plore the ways information is transmitted from one individual to 
another. The simplest model consists of three elements. 




SENDER  -->  CHANNEL  -->  RECEIVER



The example of listening to an audio tape can be used to explain these
elements. You, the listener, are the receiver; the medium, audio tape,
is the channel; and the words or information conveyed originate from
the sender. The tape itself does not in any way alter the factual
information presented, and that information could also be conveyed by
printed or other means. Although the information or ideas conveyed
are independent of the medium, the medium may in some cases allow the
sender to add interpretation to the information, by voice expression
for example. This in no way alters the factual base of the information.

Since indexing systems concentrate on the information content of
materials, those systems used for printed materials can work equally
well for films, tapes, slides, or any other form of media. In all
but the very largest collections of materials, general indexing systems
will suffice for all media. In extremely large or comprehensive
subject media collections, special indexing systems may need to
be developed, but their development would parallel the present
development of special indexing systems and classification schemes
for collections of medical, legal and architectural materials.
These special indexing systems, such as MeSH (Medical Subject
Headings), are applicable to all media.



DEFINITION OF INDEXING 

According to Ms. Hilda Feinberg in her book Title Derivative
Indexing Techniques:

     Indexing consists of indicating the subject
     content of an item of information by assigning
     one or more terms to the document so as
     to characterize it. The word "term" is used
     broadly to include any form of class, subclass,
     subject heading, Uniterm, compound
     word or phrase.

Indexing, therefore, is the specific activity of identifying the
subject or informational content of an item. This is in contrast
to the general concept of cataloging, where descriptive information
is the chief attribute. There are countless cataloging manuals for
non-print media. Many states, school districts and associations have
published one. These cataloging manuals outline in detail how to
describe a physical item, including the appropriate media designator,
lengths, speeds, sizes and other identifying factors. For the most
part these manuals concern themselves with the physical attributes
of the item and not the informational or subject content. Very little
is said in most of these manuals about subject content. Usually a
one-sentence statement is made relating to the subject content, such
as "use Sears subject headings" or "classify in Dewey."

These general statements may seem insufficient and
lacking in detail, but cataloging manuals concentrate on
access by physical form of media, rather than access by subject
or content.

A concept frequently used in indexing is the "entry point."
This concept can best be explained by examining a user search. When
a user chooses a term and looks for that term in a completed index,
that term becomes the "entry point" or "access point," the door into
the index to secure the information. The terms or words used to look
up information are entry points.

While the main topic discussed here is media indexing from a 
subject point of view, indexing on grade level, date of production, 
or other significant information can be handled through an added entry 
process, to be discussed later. 



OVERVIEW OF INDEXING SYSTEMS 

The number of existing indexing systems extends into the hundreds:
some are general information systems, but most are highly specific. The
basic difference among these systems usually lies in the list of terms
and the method for arriving at these terms. For highly specialized
and exhaustive collections in a given subject field, special thesauri
or word lists are constructed to allow for retrieval of documents at
an acceptable level. In any indexing system the acceptable level in
retrieving documents is that point where the maximum number of
documents is retrieved with the minimum number of documents
inappropriate to the request. All indexing systems are based on
assigning a word or words to the subject content of an item or, in the
case of a classified index, assigning a code or classification number
which can be converted to subject terms. If an item consists of
multiple concepts, more than one term would be assigned.

The retrieval term can be determined from three sources. One is the
thesaurus, or approved list of terms to be used in describing items in
a given collection. This thesaurus method includes classification lists
where numbers are used as terms. A second method is the KWIC (Keyword
in Context) or KWOC (Keyword out of Context) index, where the terms
consist of words taken directly from the title of the item. The third, a
more sophisticated, mechanical system, is automatic indexing, where the
terms are derived from the contents of the document by analyzing the
frequency of occurrence of words in the text. These words are then
used as descriptors.



The thesaurus is the most common and simplest form of indexing
for small to medium general collections. By this method, terms are
assigned from a prescribed list of terms to indicate the subject
content of the item. This master list of terms, or thesaurus, is
constantly being updated to include new terms and relationships. Cross
references from similar terms are usually included. The advantage
of the thesaurus approach is the existence of a definite list of terms.
The student is able to refer to a finite list of entry points for his
search. He must, however, translate his request into terms from this
list.
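
The translation step described above can be sketched in a few lines of
Python. This is only an illustration: the approved terms, the cross
references and the function name resolve_term are hypothetical and are
not taken from Sears or the Library of Congress lists.

```python
# Hypothetical mini-thesaurus: a finite list of approved terms plus
# "see" cross references from similar terms (all invented for illustration).
APPROVED_TERMS = {"Audio-visual materials", "Birds", "Educational technology"}
SEE_REFERENCES = {
    "audiovisual aids": "Audio-visual materials",
    "instructional technology": "Educational technology",
    "ornithology": "Birds",
}


def resolve_term(request):
    """Translate a user's request into an approved term, or return None."""
    wanted = request.strip().lower()
    # First check whether the request already is an approved term.
    for term in APPROVED_TERMS:
        if term.lower() == wanted:
            return term
    # Otherwise follow a cross reference, if one exists.
    return SEE_REFERENCES.get(wanted)
```

A request that is neither an approved term nor a cross reference
returns None, which corresponds to the user having to rethink the
request in the vocabulary of the list.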

Classification is related to thesaurus indexing. The classification
scheme is basically a thesaurus with numbers or letters assigned to
each term. The class number then becomes a compound entry representing
many terms related to the contents of the item. Since classification
schemes combine many terms in one class number, they provide a single
entry point, usually under the most prevalent subject. Additional entry
points are generally not available.

Indexing by the KWIC or KWOC method is based solely on the title of
the item. The index is prepared by providing entry points under each
significant word in the title. The number of entries is determined by
the number of significant words in the title. To do an exhaustive
search on any topic, all synonyms would need to be checked, since the
documents are not grouped under specific terms. There are no cross
references in a KWIC/KWOC index. An advantage of this method is that no
intermediate step of referring to a thesaurus is necessary. The index
is in natural language, and the terms used are those used by the
author. This index can most easily be prepared mechanically, by
inputting the titles of all documents to be indexed into a computer.
As an example, a KWIC index for the two titles "An Inquiry into 
the Uses of Instructional Technology" and "Educational Technology 
in the Seventies" is as follows. 

                  seventies  EDUCATIONAL technology in the
              technology An  INQUIRY into the uses of instructional
 n inquiry into the uses of  INSTRUCTIONAL technology A
ucational technology in the  SEVENTIES Ed
                Educational  TECHNOLOGY in the seventies
  the uses of instructional  TECHNOLOGY An inquiry into
        An inquiry into the  USES of instructional technology

As mentioned above, KWIC indexes are not cross-referenced. Therefore,
one would have to check related terms, such as audio-visual media,
to locate additional references, and both instructional technology
and educational technology would need to be used to retrieve information.
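
As a rough sketch of how such an index might be prepared by computer,
the Python below builds one entry per significant title word. The
stop-word list and the function name kwic_entries are assumptions made
for illustration, and the sketch omits the fixed-width truncation and
wrap-around of a conventional KWIC printout.

```python
# Words treated as not significant (an assumed, minimal stop list).
STOP_WORDS = {"a", "an", "the", "of", "in", "into", "and", "to", "for"}


def kwic_entries(titles):
    """Return (KEYWORD, left context, right context) tuples sorted by keyword."""
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            if word.lower() in STOP_WORDS:
                continue  # no entry point for insignificant words
            left = " ".join(words[:i])
            right = " ".join(words[i + 1:])
            entries.append((word.upper(), left, right))
    return sorted(entries)


titles = [
    "An Inquiry into the Uses of Instructional Technology",
    "Educational Technology in the Seventies",
]
# Print with the keyword aligned in a central column.
for keyword, left, right in kwic_entries(titles):
    print(f"{left:>45}  {keyword} {right}")
```

The two titles yield seven entries, one for each significant word,
which matches the count in the example above.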

A KWOC index is constructed similarly to a KWIC index, except that the
keyword is repeated out of context, usually in a column at the beginning
of each line. The first two entries from our previous example would
appear as follows.

EDUCATIONAL    Educational technology in the seventies
INQUIRY        An inquiry into the uses of instructional technology
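
A KWOC listing of this kind could be produced along the following
lines. The stop-word list, the column width and the function name
kwoc_lines are illustrative assumptions, not part of any published
KWOC program.

```python
# Words treated as not significant (an assumed, minimal stop list).
STOP_WORDS = {"a", "an", "the", "of", "in", "into", "and", "to", "for"}


def kwoc_lines(titles, width=15):
    """Return one line per significant word: keyword column, then full title."""
    lines = []
    for title in titles:
        for word in title.split():
            if word.lower() not in STOP_WORDS:
                # The keyword is lifted out of context into a left-hand column.
                lines.append(f"{word.upper():<{width}}{title}")
    return sorted(lines)


titles = [
    "An Inquiry into the Uses of Instructional Technology",
    "Educational Technology in the Seventies",
]
for line in kwoc_lines(titles):
    print(line)
```

Unlike the KWIC form, every line shows the complete title, so no
context is lost to truncation.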

Another automated but more sophisticated method is automatic indexing.
This method also uses the author's terms directly, but the entire document
is used in determining the index terms. Terms are used as entry points
if they occur with a predetermined frequency in the document. The
disadvantage of this method is that the entire document must be entered
into the computer. Optical scanning and other technological improvements
may aid in this input.
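
The frequency rule can be illustrated with a short Python sketch. The
threshold of two occurrences, the stop-word list, the sample text and
the function name automatic_index_terms are all assumptions chosen for
the example; a working system would tune these to its collection.

```python
import re
from collections import Counter

# Common words excluded from consideration (an assumed, minimal stop list).
STOP_WORDS = {"a", "an", "the", "of", "in", "and", "to", "for",
              "by", "are", "is", "that", "each"}


def automatic_index_terms(text, min_count=2):
    """Return the words that occur at least min_count times in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return sorted(term for term, n in counts.items() if n >= min_count)


sample = ("Slides are indexed by subject. Each subject card lists the "
          "slides that treat the subject.")
print(automatic_index_terms(sample))
```

Raising min_count gives fewer, more selective entry points; lowering
it gives more entry points at the cost of less relevant ones.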

In addition to the variance in indexing methods for determining
entry points, differences also occur in the methods of file organization.
Once the word list has been established, there are several methods of file
organization for storing the completed index. Two of the most common
types are the catalog card file and the printed book catalog. Other
special methods include edge notch cards, peek-a-boo or optical
coincidence, and computer data bases. These latter systems are most
helpful for coordinate indexing, that is, locating documents with at
least two specified concepts.

In a card file, entries are arranged alphabetically under the
subject terms, and sufficient information to locate the
item is also given. This file method is usable when the indexing
method is a thesaurus or classification list. One advantage of the
card file is that it is easy to update.

The printed list or book catalog is useful for all indexing
methods and can vary from an exact copy of an existing card file
to a computer generated KWIC index or automatic indexing output. An
advantage of the printed catalog is its portability. It is, however,
difficult to keep current.

In an edge notch system, cards are used with holes around the edge.
These holes are assigned subject meaning, and each card represents a
document. The holes representing the subject contents of a given
document are notched out. In this method a rod is passed through
the specific hole for the subject desired in the stack of document
cards. Those containing that subject will drop out of the file.
Coordinate indexing can be accomplished by the use of two or more
rods. In that instance only those documents with all the subjects
sought will drop.
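
The logic of the edge notch file can be modelled in Python as one
record per document holding the set of notched subjects. The document
identifiers, subjects and the function name needle are hypothetical,
chosen only to illustrate the mechanism.

```python
# Each document card lists the subject holes that have been notched out.
# Identifiers and subjects are invented for illustration.
DOCUMENT_CARDS = {
    "doc-1": {"birds", "migration"},
    "doc-2": {"birds", "film"},
    "doc-3": {"indians", "film"},
}


def needle(cards, *subjects):
    """Return the cards that drop when a rod is passed through each subject hole."""
    wanted = set(subjects)
    # A card drops only if it is notched at every hole holding a rod.
    return sorted(doc for doc, notched in cards.items() if wanted <= notched)


print(needle(DOCUMENT_CARDS, "birds"))          # one rod
print(needle(DOCUMENT_CARDS, "birds", "film"))  # two rods: coordinate search
```

Note that every search examines the whole deck, which is one reason
the manual method becomes impractical as the collection grows.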



The peek-a-boo or optical coincidence method is similar to the
edge notch method. With peek-a-boo indexing each card is representative
of a subject. On each card is a grid of small boxes, each box
representing a document. These boxes are punched out for all documents
containing a specific subject on that subject card. To locate documents
on two or more subjects, the card for each subject is pulled from the
master file. The cards are held together to the light, and any area
with a hole through all the cards chosen represents a document containing
those subjects.
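
Holding cards up to the light amounts to a bitwise AND of the punched
positions. The Python sketch below represents each subject card as an
integer whose set bits are the punched document numbers; the card
contents and the function names punch and coincide are illustrative
assumptions.

```python
def punch(document_numbers):
    """Build a subject card: bit n is set when document n is punched."""
    card = 0
    for n in document_numbers:
        card |= 1 << n
    return card


def coincide(*cards):
    """Return the document numbers punched on every card held to the light."""
    if not cards:
        return []
    light = cards[0]
    for card in cards[1:]:
        light &= card  # light passes only where all cards have a hole
    return [n for n in range(light.bit_length()) if (light >> n) & 1]


# Hypothetical subject cards for a small collection.
SUBJECT_CARDS = {
    "birds": punch([1, 2, 7]),
    "film": punch([2, 3, 7]),
    "migration": punch([1, 7]),
}
print(coincide(SUBJECT_CARDS["birds"], SUBJECT_CARDS["film"]))
```

In contrast to the edge notch file, only the cards for the subjects
sought are handled, not one card per document.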

Another related system is Uniterm or terminal digit indexing. Again
in this system each subject is represented by a card. These cards are
divided into 10 columns, and the serial numbers for documents on a specified
subject are entered on that subject card in columns according to the last
digit of their serial number. This method makes visual scanning of the
subject cards for document retrieval easier.
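
A terminal digit card can be sketched in Python as ten lists keyed by
the last digit of the serial number. The serial numbers and the
function names uniterm_card and coordinate are hypothetical examples.

```python
def uniterm_card(serial_numbers):
    """Post serial numbers into ten columns keyed by their last digit."""
    columns = {digit: [] for digit in range(10)}
    for number in sorted(serial_numbers):
        columns[number % 10].append(number)
    return columns


def coordinate(card_a, card_b):
    """Compare two subject cards column by column; return shared serial numbers."""
    shared = []
    for digit in range(10):
        # Only numbers in the same column can match, which shortens the scan.
        shared.extend(n for n in card_a[digit] if n in card_b[digit])
    return sorted(shared)


# Hypothetical postings for two subjects.
birds = uniterm_card([12, 47, 102, 317])
film = uniterm_card([47, 58, 102, 233])
print(coordinate(birds, film))
```

The column layout is what makes visual scanning easier: a searcher
comparing two cards need only compare like columns.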



Both of these systems break down with collections in excess of
10,000 items. Their main advantage is the ability to do coordinate
searching and thereby increase the precision of searches.



Because of the extreme speed and large storage capabilities
of the computer, it is able to handle any of the indexing methods
described. Edge notch and peek-a-boo are manual systems recreating
some of the combinatory possibilities of a computer. The only method
thus far described for both indexing and file organization that the
computer cannot completely handle is the thesaurus approach. With the
thesaurus an indexer must choose the term or terms appropriate to describe
the document. From that point the computer can then process this
information. In addition to being able to operate under any of the
above mentioned systems and to give output in the forms prescribed by
these systems, the computer is also capable of a file organization of
its own and will retrieve document references directly from an internal
master file upon request.

The number of entry points or terms used to index a document is
important to its retrievability. By assigning multiple entries, as
many as are felt necessary, all important subjects represented in an
item can be indexed, and the item is more easily retrieved. This is
especially necessary given the multidisciplinary nature of materials
today: a single subject is no longer sufficient to index an item
adequately. Most indexing methods allow for multiple entry points. As
we have said, however, classification schemes can provide only a single
entry point.



APPLICABILITY OF EXISTING INDEXING SYSTEMS TO MEDIA



Having reviewed the state of indexing, how does the individual
in an Instructional Media Center with a starting collection determine
the indexing method to be used? What method of file organization is
best? Although subject content is independent of form, we can see that
some indexing methods are dependent to an extent on the printed word
and are not applicable to purely visual materials. Insofar as
possible, each Instructional Media Center should adopt one indexing
method for all media - print and non-print.

Of the methods of indexing, the thesaurus or subject words
approach is the most satisfactory for dealing with media. Since
media titles are often less meaningful than book titles, KWIC
indexing poses a problem. Automatic indexing would require word
input and, as stated, some media are purely visual.

The two most commonly used thesauri in libraries are Sears
List of Subject Headings and Subject Headings Used in the Dictionary
Catalogs of the Library of Congress. Either of these thesauri will
provide index terms which can be assigned to the content of an item.
If it is desirable to group all items on a given subject by form
within the index, subheadings can be used (Birds - Audio
Recordings; Indians - Motion Picture Film). The main index file
for the media center should contain all entries for all materials -
print and non-print - for maximum ease of retrieval. This is more
important than intershelving and more important than classification.

The depth of indexing media can extend to several descriptors
for a single 2 x 2 slide to indicate production date, country, and art
form style, as well as depicted information. Two special indexing
schemes dealing with slides have been designed by Robert Diamond and
Wendell Simons. Motion picture film can be analyzed scene by scene or
frame by frame, in essence reducing the film to a set of slides. Books
can likewise be indexed chapter by chapter, especially in the case of
collected readings. The decision to be made is how many entry points
should be provided for each item.

You might have noted that some of the descriptors mentioned in the
previous paragraph and earlier in this paper are not subject-related.
Yet descriptors such as date of production or grade level are sometimes
desirable. The concept of "added entries" used with print materials
can be extended to cover these categories of descriptors. An added
entry is an additional entry point to the location of a document under
a specific name or place. Added entries are used in printed materials
for additional authors, or associations which were involved with the
publication.

The file organization method used would generally be a catalog
card file or printed list. In the Instructional Media Center highly
specialized retrieval is not usually a requirement, and the collections
are usually of a medium size. For these reasons edge notch and
peek-a-boo indexing are not recommended. If easy access to a computer is
available, data base indexes could be considered.

In addition to the main file, supplemental indexes
can be created for each form of media or for special subjects. These can
take the form of additional card files or printed lists. It is
also possible to set up KWIC indexes for subsets of the collection.
Some commercially produced media indexes include the Westinghouse
Learning Corporation Learning Directory; the National Center for
Educational Media Index to 16mm Educational Films; and the Library of
Congress National Union Catalog - Motion Pictures and Filmstrips.



SUMMARY 


Media indexing should be compatible with all other
indexing done in the Instructional Media Center: the same list
of terms, with possible additions, should be used, along with
the same file organization. The Sears or Library of Congress
lists of subject headings would in most cases meet the needs
of any center, and a total interfiled index (catalog card file)
would provide access.